Strategies for Dealing with Real World Classification Problems
Authors
Abstract
1.1 Motivation
1.2 Thesis Overview
2 Data Mining: Concepts and Definitions
2.1 Data Mining as Process
2.2 Knowledge Extraction Tasks
2.3 Classification Methods
2.4 Model Evaluation
2.4.1 Evaluation Tactics
2.4.2 Metrics
3 Handling Incomplete Data
3.1 Problem Statement
3.2 Methods for Dealing with Missing Data
3.2.1 Filter-based MDTs
3.2.2 Imputation-based MDTs
3.2.3 Embedded MDTs
3.3 A New Method for Data Imputation
3.3.1 Method Description
3.3.2 Experimental Evaluation
3.4 Conclusions on Data Imputation
4 Feature Selection
4.1 Problem Statement
4.2 Feature Selection Techniques
4.2.1 Search Strategies
4.2.2 Evaluation Measures
4.2.3 Filter Methods
4.2.4 Wrapper Methods
4.3 Combining Generation Strategies
4.4 Experimental Evaluation
4.4.1 Evaluating the Wrapper Methodology
4.4.2 Evaluating the Combination Strategy
4.5 Conclusions on Feature Selection
Similar resources
Robust inter and intra-cell layouts design model dealing with stochastic dynamic problems
In this paper, a novel quadratic assignment-based mathematical model is developed for the concurrent design of robust inter- and intra-cell layouts in dynamic stochastic environments of manufacturing systems. In the proposed model, in addition to considering the time value of money, the product demands are presumed to be dependent, normally distributed random variables with known expectation, variance, a...
Hierarchical Facility Location and Hub Network Problems: A literature review
In this paper, a complete review of published research on hierarchical facility location and hub network problems is presented. A hierarchical network is a system in which facilities with different service levels interact in a top-down way or vice versa. In hierarchical systems, service levels are composed of different facilities. Papers published from 1970 to 2015 have been studied and a c...
On the use of Heronian means in a similarity classifier
This paper introduces new similarity classifiers using the Heronian mean and the generalized Heronian mean operators. We examine the use of these operators at the aggregation step within the similarity classifier. The similarity classifier was earlier studied with other operators, in particular with an arithmetic mean, generalized mean, OWA operators, and many more. The two classifiers here ar...
An Improved Fuzzy Neural Network for Solving Uncertainty in Pattern Classification and Identification
Dealing with uncertainty is one of the most critical problems in complicated pattern recognition subjects. In this paper, we modify the structure of a useful Unsupervised Fuzzy Neural Network (UFNN) of Kwan and Cai, and compose a new FNN with 6 types of fuzzy neurons and its associated self-organizing supervised learning algorithm. This improved five-layer feedforward Supervised Fuzzy Neural Netwo...
Reliability Optimization for Complicated Systems with a Choice of Redundancy Strategies (TECHNICAL NOTE)
Redundancy allocation is one of the common techniques to increase the reliability of the bridge systems. Many studies on the general redundancy allocation problems assume that the redundancy strategy for each subsystem is predetermined and fixed. In general, active redundancy has received more attention in the past. However, in the real world, a particular system design contains both active and col...